A Simple LNRE Model for Random Character Sequences

Author

  • Stefan Evert

Abstract

This paper describes a population model for word frequency distributions based on the Zipf-Mandelbrot law, corresponding to the word frequency distribution induced by a random character sequence. The model, which has convenient analytical and numerical properties, is shown to be adequate for the description of language data extracted by automatic means from large text corpora. It can thus be used to study the problems faced by the statistical analysis of such data in the field of natural-language processing.
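
The random-character setting mentioned in the abstract can be illustrated with a small simulation. The sketch below is not the paper's model. It only generates a random character sequence, splits it into "words" at spaces, and tabulates the resulting frequency distribution. The function names, the five-letter alphabet, the space probability, and the seed are all assumptions chosen for illustration.

```python
import random
from collections import Counter


def random_text_frequencies(n_chars, letters="abcde", p_space=1 / 6, seed=0):
    """Count word frequencies in a random character sequence.

    Illustrative assumptions: every letter is equally likely, and a space
    (the word delimiter) is drawn with probability p_space.
    """
    rng = random.Random(seed)
    symbols = list(letters) + [" "]
    p_letter = (1 - p_space) / len(letters)
    weights = [p_letter] * len(letters) + [p_space]
    text = "".join(rng.choices(symbols, weights=weights, k=n_chars))
    # str.split() drops the empty strings produced by consecutive spaces.
    return Counter(text.split())


def rank_frequency(counts):
    """Frequencies sorted in decreasing order (the Zipf ranking)."""
    return sorted(counts.values(), reverse=True)


def frequency_spectrum(counts):
    """Map m to the number of distinct word types occurring exactly m times."""
    return Counter(counts.values())


counts = random_text_frequencies(200_000)
freqs = rank_frequency(counts)
spectrum = frequency_spectrum(counts)

print("tokens:", sum(counts.values()), "types:", len(counts))
print("top ranks:", counts.most_common(5))
print("types seen once:", spectrum[1])
```

Under these assumptions, short words dominate the top ranks and a long tail of types occurs only once or twice. Plotting `freqs` against rank on log-log axes is one way to compare the simulated distribution with a Zipf-Mandelbrot curve of the form f(r) ∝ 1/(r + b)^a.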

Similar resources

zipfR: Word Frequency Distributions in R

We introduce the zipfR package, a powerful and user-friendly open-source tool for LNRE modeling of word frequency distributions in the R statistical environment. We give some background on LNRE models, discuss related software and the motivation for the toolkit, describe the implementation, and conclude with a complete sample session showing a typical LNRE analysis.

zipfR: Word Frequency Modeling in R

We introduce the zipfR package, a powerful and user-friendly open-source tool for LNRE modeling of word frequency distributions in the R statistical environment. We give some background on LNRE models, discuss related software and the motivation for the toolkit, describe the implementation, and conclude with a complete sample session showing a typical LNRE analysis.

The Structural Model of Relationship between personality and neuropsychological Coolidge test and bulling behavior with Mediating Role of temperament – character and empathy in secondary school students

Aim: Research on violent behaviors in schools, due to its implications for students' educational performance, is among the priorities of instructional policies. Bullying in school is known as a global phenomenon and has a serious impact on students' health. The aim of this study was to present the model of the relationship between personality and neuropsychological Coolidge test and bullying behavior ...

Efficient Algorithms for Model-Based Motif Discovery from Multiple Sequences

We study a natural probabilistic model for motif discovery that has been used to experimentally test the quality of motif discovery programs. In this model, there are k background sequences, and each character in a background sequence is a random character from an alphabet Σ. A motif G = g1g2 . . . gm is a string of m characters. Each background sequence is implanted a randomly generated approx...

Magnetic Properties and Phase Transitions in a Spin-1 Random Transverse Ising Model on Simple Cubic Lattice

Within the effective-field theory with correlations (EFT), a transverse random field spin-1 Ising model on the simple cubic (z=6) lattice is studied. The phase diagrams, the behavior of critical points, the transverse magnetization, the internal energy, and the magnetic specific heat are obtained numerically and discussed for different values of p, the concentration of the random transverse field.

Publication date: 2004